Learning TOMs: Towards Non-Myopic Equilibria
Authors
Abstract
In contrast to the classical game-theoretic analysis of simultaneous and sequential play in bimatrix games, Steven Brams has proposed an alternative framework called the Theory of Moves (TOM), in which players choose their initial actions and then, in alternating turns, decide whether or not to shift from their current actions. A backward induction process is used to determine a non-myopic action, and an equilibrium is reached when an agent, on its turn to move, decides not to change its current action. Brams claims that the TOM framework captures the dynamics of a wide range of real-life non-cooperative negotiations, including political, historical, and religious disputes. We believe that his analysis is weakened by the assumption that a player has perfect knowledge of the opponent's payoffs. We present a learning approach by which TOM players can converge to Non-Myopic Equilibria (NMEs) without prior knowledge of their opponents' preferences, by inducing those preferences from the opponents' past choices. We present experimental results from all structurally distinct 2-by-2 games without a common preferred outcome, showing that our proposed learning player converges to NMEs. We also discuss the relation between equilibria in sequential games and the NMEs of TOM.
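The abstract describes TOM's move/stay dynamic and its backward-induction analysis. The sketch below illustrates one way such an analysis can be coded for a 2-by-2 ordinal game; the representation (a dict of ordinal ranks 1-4), the termination rule (one full cycle of four moves returns play to the initial state), and the strict-preference move rule are assumptions made for illustration, not the paper's implementation or its learning algorithm.

```python
# A minimal sketch of TOM-style backward induction on a 2x2 ordinal game.
# Assumptions (not from the paper): payoffs are ordinal ranks 1..4 (4 = best),
# a cycle of four moves ends back at the initial state, and a player moves
# only if it strictly prefers the induced outcome. Brams' finer conventions
# (e.g. tie-breaking, two-sidedness) are deliberately omitted.

ROW, COL = 0, 1

def switch(state, player):
    """State reached when `player` unilaterally switches its action."""
    r, c = state
    return (1 - r, c) if player == ROW else (r, 1 - c)

def induced_outcome(game, start, first_mover):
    """Backward-induct over one cycle start -> s1 -> s2 -> s3 -> start."""
    # Build the cycle of states and the player who moves at each of them.
    states, movers = [start], [first_mover]
    for _ in range(3):
        states.append(switch(states[-1], movers[-1]))
        movers.append(1 - movers[-1])
    # Survivor at the last node: the mover either stays or cycles back to start.
    p = movers[3]
    survivor = start if game[start][p] > game[states[3]][p] else states[3]
    # Fold the survivor back through s2, s1, and the initial state.
    for i in (2, 1, 0):
        p = movers[i]
        survivor = survivor if game[survivor][p] > game[states[i]][p] else states[i]
    return survivor

def non_myopic_equilibria(game):
    """States from which neither player, moving first, would depart."""
    return [s for s in game
            if all(induced_outcome(game, s, p) == s for p in (ROW, COL))]

if __name__ == "__main__":
    # Ordinal Prisoner's Dilemma: action 0 = cooperate, 1 = defect.
    pd = {(0, 0): (3, 3), (0, 1): (1, 4),
          (1, 0): (4, 1), (1, 1): (2, 2)}
    print(non_myopic_equilibria(pd))  # [(0, 0), (1, 1)] under these simplified rules
```

Under these simplified rules the mutual-cooperation state survives as an equilibrium even though it is not a Nash equilibrium of the one-shot game, which is the kind of non-myopic behaviour the abstract refers to; the paper's contribution is learning such behaviour without knowing the opponent's payoffs, which this sketch does not attempt.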
Similar Resources
Beyond myopic best response (in Cournot competition)
A Nash Equilibrium is a joint strategy profile at which each agent myopically plays a best response to the other agents’ strategies, ignoring the possibility that deviating from the equilibrium could lead to an avalanche of successive changes by other agents. However, such changes could potentially be beneficial to the agent, creating incentive to act non-myopically, so as to take advantage of ...
Finite Population Dynamics and Mixed Equilibria
This paper examines the stability of mixed-strategy Nash equilibria of symmetric games, viewed as population profiles in dynamical systems with learning within a single, finite population. Alternative models of imitation and myopic best reply are considered under different assumptions on the speed of adjustment. It is found that two specific refinements of mixed Nash equilibria identify focal r...
Bayesian Learning in Normal Form Games
This paper studies myopic Bayesian learning processes for finite-player, finitestrategy normal form games. Initially, each player is presumed to know his own payoff function but not the payoff functions of the other players. Assuming that the common prior distribution of payoff functions satisfies independence across players, it is proved that the conditional distributions on strategies converg...
Fading memory learning in a class of forward-looking models with an application to hyperinflation dynamics
We analyze a class of non-linear deterministic forward-looking economic models (the state today is affected by today's and tomorrow's expectations) under bounded rationality learning. The learning mechanism proposed in this paper defines the expected state as a geometric average of past observations. We show that the memory of the learning process plays a stabilizing role: it enlarges the loca...
Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance
Several studies have demonstrated that reward from a human trainer can be a powerful feedback signal for control-learning algorithms. However, the space of algorithms for learning from such human reward has hitherto not been explored systematically. Using model-based reinforcement learning from human reward, this article investigates the problem of learning from human reward through six experim...